ABSTRACT


A class of finite-memory deterministic algorithms is introduced and 
investigated. Optimum algorithms are found for a small number of states 
(up to 21) and an asymptotic bound on error probability is obtained for 
a large number of states. The algorithms provide their own stopping
rule.

I. INTRODUCTION


Computers seem to grow more complex and more sophisticated on almost 
a continuous basis. A logical future step in computer development would
be to provide the computer with some decision making capability. This 
capability of necessity would be limited by the size of the computer core. 
Such a property would be of particular value if the computer were design- 
ed to operate without human assistance. For example, exploration of the 
nearer stars could most easily be accomplished by unmanned spacecraft. 
Yet the immense distances preclude human involvement in any decision pro- 
cess. If a computer with a decision making capability were to be used 
within the spacecraft, it would almost certainly be small and of very 
limited core size. Such an automaton would be required to make decisions 
with minimum probability of error and constrained by available memory. 
One form of a decision process could probably be adapted from the statis- 
tical test of hypothesis. 

In testing a hypothesis, the statistician normally forms the likelihood
ratio and reduces the ratio to a sufficient statistic which is to be
less than or greater than some constant k. For a sample of size n, $\alpha$
(probability of Type I error) and $\beta$ (probability of Type II error) will
exponentially approach zero as n becomes large. To apply the procedure
at time n means that sufficient memory must be available at time n to
record the observations $X_1, X_2, \ldots, X_n$. In even the simplest cases this
memory must grow indefinitely with time. Summarizing the data in a
sufficient statistic does not ensure that the memory requirement will be
reduced. The sufficient statistic is data reducing only in that it maps
the observations from $R^n$ to $R$, but the cardinality of the memory may be
at least as great. For example, suppose that an experimenter wishes to
estimate $\mu$ for a Normal random variable and uses $\bar{x}$ as a sufficient
statistic. Further suppose that in 99 repeated trials the experimenter finds
that $\sum x = 1.000$. Then $\bar{x} = 1.0/99 = 0.0101\ldots$, and the memory requirement
has become infinite. It would be tempting to round off the statistic to
some finite dimension; however, Cover [1] has shown that $\alpha$ and $\beta$ then
do not tend to zero. Additionally, the rounded off statistic may not
converge to the same distribution as the estimated parameter. If
constrained by finite memory, some other model must be devised. One possible
approach is to use only the last k observations. This idea has been
investigated by Robbins [2].

Although not originally intended as a method for testing hypotheses, 
Robbins' model maximizes the long run expected number of "successes" given 
two alternative courses of action and finite memory. Suppose that an 
experimenter has two coins and that he wishes to maximize the number of 
heads thrown during a sequence of tosses. The minimum variance unbiased
estimator of p is $\bar{x}$ and as the number of repeated trials increases, the
variance of the estimator goes to zero. Therefore if an experimenter
had prior knowledge of the probability of heads for each coin he would
use the coin with the greater probability of heads exclusively and know
with certainty that

$$\lim_{n\to\infty} \frac{\text{number of heads in first } n \text{ tosses}}{n} = \max(p_1, p_2)$$

where $p_i$ = probability of heads for the $i$th coin. Without prior knowledge,
and constrained by a finite memory, the experimenter must decide
which coin to use on the basis of the results of the previous r trials.
Robbins formulates a decision rule in the proof of the following theorem:


“Define the rule $R_r$ as follows: start tossing with coin 1. Stop
if the first toss is tail, otherwise continue tossing until the first
run of r successive tails occurs and then stop. This defines the
first block of tosses with coin 1. Now start tossing with coin 2
and apply the same rule, obtaining the first block of tosses with
coin 2. Then start again with coin 1 and apply the same rule, obtaining
the second block of tosses with coin 1, and so on indefinitely,
thus generating an infinite sequence of tosses consisting of alternate
blocks of tosses with coins 1 and 2. With rule $R_r$ so defined,
we assert that

$$\lim_{n\to\infty} \frac{\text{number of heads in first } n \text{ tosses}}{n} = \frac{p_1 q_2^r + p_2 q_1^r}{q_1^r + q_2^r}.$$”

Note that, with $q_i = 1 - p_i$,

$$\lim_{r\to\infty} \frac{p_1 q_2^r + p_2 q_1^r}{q_1^r + q_2^r} = \max(p_1, p_2).$$

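The convergence of Robbins' limit to max(p_1, p_2) can be checked numerically. The following is a minimal sketch that only evaluates the formula quoted above; the function name `robbins_limit` is an illustrative choice and does not come from Robbins or from this paper.

```python
def robbins_limit(p1: float, p2: float, r: int) -> float:
    """Long-run proportion of heads under rule R_r, per the quoted formula."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return (p1 * q2**r + p2 * q1**r) / (q1**r + q2**r)


if __name__ == "__main__":
    # The value should climb toward max(p1, p2) as the run length r grows.
    for r in (1, 2, 5, 10, 20):
        print(r, robbins_limit(0.6, 0.4, r))
```

The weight placed on the better coin is q_2^r / (q_1^r + q_2^r), which tends to 1 as r grows whenever p_1 > p_2.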
Using methods similar to the Robbins model, Cover [1] developed a
4-state memory algorithm for testing the hypothesis $p > p_0$ vs $p < p_0$,
given a sequence of iid Bernoulli random variables. In the Cover model,
each member of the pair (T,Q) can take values in {0,1}. T keeps track of
the currently favored hypothesis and Q records the results of the current
run test.

Two sequences $\{s_i\}$ and $\{r_i\}$ of positive integers are considered. The
sequence of observations is divided into blocks $S_1, R_1, S_2, R_2, \ldots$
where $S_1$ denotes the first $s_1$ observations, $R_1$ the next $r_1$ observations,
and so on.
T is initially arbitrary; if all observations in a block $S_i$ are equal to
1, Q is set to 1. If all observations in a block $R_i$ are equal to 0, Q
is set to 1. At the end of each block the currently favored hypothesis
is updated by the rule

$$T_n = \begin{cases} 1 & \text{if } Q = 1 \text{ and } n \text{ is at the end of an } S \text{ block} \\ 0 & \text{if } Q = 1 \text{ and } n \text{ is at the end of an } R \text{ block} \\ T_{n-1} & \text{otherwise.} \end{cases}$$

The lengths of the blocks of S's and R's are determined as a function of
$p_0$, and Cover shows that the limiting probability of error under either
hypothesis is zero. (See Figure 1).


Figure 1. A two state Markov chain where T can take on values in {0,1}.


Although the memory size in Cover's model is now finite the updating rule 
still depends on n. 
The first genuine finite memory model has been proposed by Hellman and
Cover [3],[4]. They proposed a family of algorithms of the type

$$T_n = f(T_{n-1}, X_n), \qquad T_n \in \{1,2,\ldots,m\}, \quad n = 1,2,\ldots$$

$$d_n = d(T_n), \qquad d_n \in \{H_0, H_1\}.$$

$T_n$ denotes the statistic at time n, $X_n$ is the value of the $n$th sample,
f is a transition function and d is a decision function. Note that T
is of finite memory since $T_n \in \{1,2,\ldots,m\}$, and given an initial value of
the statistic, the sequence $\{T_n\}$ forms a Markov chain over the state space
$M = \{1,2,\ldots,m\}$. The goal is to minimize the expected asymptotic
proportion of errors

$$P(e) = E\left\{ \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} e_i \right\}$$

where

$$e_i = \begin{cases} 1 & \text{if } d_i \ne H_{\text{true}} \\ 0 & \text{if } d_i = H_{\text{true}}. \end{cases}$$

Hellman and Cover have established a lower bound for the proportion
of errors. Let $\pi_0$ and $\pi_1$ denote the prior probabilities of the null and
alternate hypotheses. Let $f_0$ and $f_1$ be the probability densities of the
sample under the respective hypothesis with respect to a dominating
measure. Define the likelihood ratio to be $\ell(x) = f_0(x)/f_1(x)$. Let
$\bar{\ell}$ denote the ess sup of the likelihood ratio and $\underline{\ell}$ the ess inf, where the
supremum and infimum are taken over all measurable sets with positive
dominating measure. Define $\gamma = \bar{\ell}/\underline{\ell}$. Then for an irreducible m-state
automaton, $P(e) \ge P^*$ where

$$P^* = \begin{cases} \dfrac{2\sqrt{\pi_0 \pi_1 \gamma^{m-1}} - 1}{\gamma^{m-1} - 1} & \text{if } \gamma^{m-1} \ge \max\{\pi_0/\pi_1,\ \pi_1/\pi_0\} \\[2ex] \min\{\pi_0, \pi_1\} & \text{otherwise.} \end{cases}$$



Hellman and Cover further prove that a reducible** (m+1)-state automaton
obeys the same bound on P(e) as an irreducible m-state automaton.
If the prior probabilities of the null and alternate hypotheses are
equally likely, that is if $\pi_0 = \pi_1 = \tfrac{1}{2}$, then for an irreducible m-state
automaton

$$P(e) \ge P^* = \frac{\gamma^{(m-1)/2} - 1}{\gamma^{m-1} - 1} = \frac{1}{\gamma^{(m-1)/2} + 1}.$$
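The bound is easy to evaluate for given priors, likelihood-ratio spread and memory size. The sketch below simply transcribes the two-case formula for P* stated above; the function name `hellman_cover_bound` and its argument names are illustrative assumptions, and the guard for gamma^(m-1) = 1 is added here only to avoid a division by zero.

```python
import math


def hellman_cover_bound(gamma: float, m: int, pi0: float = 0.5) -> float:
    """Lower bound P* for an irreducible m-state automaton (as stated above)."""
    pi1 = 1.0 - pi0
    g = gamma ** (m - 1)
    # When g == 1 the first branch is 0/0; min(pi0, pi1) is the sensible value.
    if g > 1.0 and g >= max(pi0 / pi1, pi1 / pi0):
        return (2.0 * math.sqrt(pi0 * pi1 * g) - 1.0) / (g - 1.0)
    return min(pi0, pi1)


if __name__ == "__main__":
    # With equal priors the bound reduces to 1 / (gamma**((m-1)/2) + 1).
    for m in (2, 3, 5, 9):
        print(m, hellman_cover_bound(9.0, m))
```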


** We call the automaton reducible (irreducible) if the Markov chain $\{T_n\}$
is reducible (irreducible).







If the m-state automaton is reducible, the bound becomes at least as
great as

$$P(e) \ge \frac{1}{\gamma^{(m-2)/2} + 1}.$$


In the case of the Bernoulli trials, consider the two hypotheses

$$H_0: p = p_0$$
$$H_1: p = p_1, \qquad \text{where } \pi_0 = \pi_1 = \tfrac{1}{2}.$$

Without loss of generality it may be assumed that $p_0 > p_1$, in which case

$$\bar{\ell} = p_0/p_1, \qquad \underline{\ell} = q_0/q_1, \qquad \text{and} \qquad \gamma = \bar{\ell}/\underline{\ell} = \frac{p_0 q_1}{p_1 q_0}.$$

Further, if the hypothesis is symmetric, that is, if $p_0 = 1 - p_1$, then
$\gamma = (p_0/q_0)^2$; for an irreducible m-state automaton

$$P(e) \ge \frac{1}{(p_0/q_0)^{m-1} + 1}$$

and for a reducible automaton

$$P(e) \ge \frac{1}{(p_0/q_0)^{m-2} + 1}.$$

While the lower bound cannot be achieved except in degenerate cases,
Hellman and Cover demonstrate an $\epsilon$-optimal class of automata, that is,
for every $\epsilon > 0$ there exists an automaton such that $P(e) \le P^* + \epsilon$.


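Define:

$$K_\epsilon = \{x \in X : \ell(x) \ge [(1/\bar{\ell}) + \epsilon]^{-1}\}$$
$$J_\epsilon = \{x \in X : \ell(x) \le \underline{\ell} + \epsilon\}$$
$$L_\epsilon = \{x \in X : x \notin K_\epsilon \cup J_\epsilon\}$$

Let the transition function $f_\delta$ be specified as follows (see Figure 2):

$$f(i,x) = \begin{cases} i+1 & x \in K_\epsilon \\ i-1 & x \in J_\epsilon \\ i & x \in L_\epsilon \end{cases} \qquad \text{for } 2 \le i \le m-1$$

$$f(1,x) = \begin{cases} 2 & \text{with probability } \delta \text{ if } x \in K_\epsilon \\ 1 & \text{otherwise} \end{cases}$$

$$f(m,x) = \begin{cases} m-1 & \text{with probability } k\delta \text{ if } x \in J_\epsilon, \quad 0 < k < \infty \\ m & \text{otherwise} \end{cases}$$

Figure 2.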


Transitions are made to adjacent states only when the events $K_\epsilon$ or $J_\epsilon$
are observed. Thus the automaton enters an end state only on strong
evidence to support that hypothesis. If $\delta$ is allowed to become arbitrarily
small, then the automaton tends to leave the end state with a very
low probability. Decisions made in the end states have the least probability
of error, so as $\delta \to 0$, P(e) should asymptotically approach P*.
While the Hellman-Cover algorithm is useful in producing sequences 
of decisions, the algorithm is not easily adapted to situations in which 
only a single decision is required. The irreducible automaton will 
asymptotically approach the lower bound for probability of error after 
a "large enough" number of observations; however, there is no easily
defined rule which would specify when this number had been reached. It
should also be noted that the Hellman-Cover automaton requires artificial 
randomization for transitions out of the end states. Some ancillary 
mechanism must be provided to achieve this desired randomization. In the 


case of a small computer, additional core storage would probably be
required. The closer P(e) is to approach P*, the smaller is the probability
$\delta$ which must be generated, which requires even more additional
core storage. It is therefore believed that there are strong pragmatic
reasons for adopting an algorithm with absorbing states despite higher
asymptotic probability of error.

In this paper a special class, $\mathcal{A}_n$, of symmetric (2n + 3)-state
algorithms with two absorbing states will be developed. Derivations and
proofs within the paper are restricted to symmetric Bernoulli random 
variables, but it would also be feasible to extend these concepts to 


non-symmetric hypotheses and to distributions other than Bernoulli. 







II. DESCRIPTION OF THE ALGORITHM 


Let $X_1, X_2, X_3, \ldots$ denote a sequence of independent identically
distributed Bernoulli random variables which can take on values H or T.
Consider two hypotheses, K and J, with equal prior probabilities and such
that

$$P(X_i = H \mid K) = P(X_i = T \mid J) = p, \qquad \text{where } \tfrac{1}{2} < p < 1.$$

As the notation suggests, the sequence of random variables can be thought
of as successive tosses of a coin which is biased towards Heads under
hypothesis K or biased towards Tails under hypothesis J.

Define the algorithm (M,f,d) (see Figure 3) such that
$M = \{-(n+1), -n, \ldots, -1, 0, 1, \ldots, n, n+1\}$ with $\pm(n+1)$ the two
absorbing states and 0 the initial state; $d(n+1) = K$, $d(-(n+1)) = J$,
otherwise arbitrary; and the transition function, f, such that

$$\begin{array}{lll} f(s,H) = s+1, & f(s,T) = s - p(s) & \text{if } s = 1, 2, \ldots, n \\ f(s,T) = s-1, & f(s,H) = s + p(-s) & \text{if } s = -1, -2, \ldots, -n \\ f(s,H) = 1, & f(s,T) = -1 & \text{if } s = 0 \\ f(s,H) = s, & f(s,T) = s & \text{if } s = \pm(n+1). \end{array}$$

The integers p(1),...,p(n) satisfy the inequality $1 \le p(s) \le s$.
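The transition rule above can be exercised on a concrete toss sequence. The following is a minimal sketch, assuming tosses are given as a string of the characters "H" and "T" and the algorithm as a tuple (p(1),...,p(n)); the names `step` and `run` are illustrative only.

```python
from typing import Iterable, Optional, Sequence, Tuple


def step(s: int, x: str, p: Sequence[int]) -> int:
    """One transition f(s, x) of the (2n+3)-state algorithm f = (p(1),...,p(n))."""
    n = len(p)
    if abs(s) == n + 1:          # absorbing states never move
        return s
    if s == 0:                   # initial state: one step toward the observed side
        return 1 if x == "H" else -1
    back = p[abs(s) - 1]         # set-back p(|s|), symmetric in the sign of s
    if s > 0:
        return s + 1 if x == "H" else s - back
    return s - 1 if x == "T" else s + back


def run(p: Sequence[int], tosses: Iterable[str]) -> Tuple[Optional[str], int]:
    """Feed tosses until absorption; return (decision or None, tosses used)."""
    n = len(p)
    s, used = 0, 0
    for x in tosses:
        s = step(s, x, p)
        used += 1
        if abs(s) == n + 1:
            return ("K" if s > 0 else "J"), used
    return None, used


if __name__ == "__main__":
    print(run((1, 1, 2, 2), "HHTHHHH"))
```

Because the absorbing states carry the decision, the number of tosses consumed by `run` is the stopping time of the algorithm on that sequence.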


[State diagram of the algorithm: states -(n+1), ..., n+1 with start
state 0, d = J at -(n+1) and d = K at n+1.]

Figure 3. An algorithm $f = (p(1),\ldots,p(n)) \in \mathcal{A}_n$.





The specific form of the algorithm (M,f,d) will henceforth be denoted
f = (p(1),...,p(n)). Figure 4 shows the algorithm and the transition
matrix for the case when n = 4 and f = (1,1,2,2).

Figure 4: The transition matrix for the case where n = 4 and f = (1,1,2,2).







III. THE $\epsilon$-OPTIMAL STOCHASTIC AUTOMATON


Although the class $\mathcal{A}_n$ consists of deterministic algorithms only, it
can be easily shown that with randomization, the Hellman-Cover lower
bound could be approached arbitrarily closely. With the algorithm
(M,f,d), the probability of error can be written

$$P_e = \tfrac{1}{2} \Pr(\text{absorption at } -(n+1) \mid K) + \tfrac{1}{2} \Pr(\text{absorption at } n+1 \mid J).$$


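Let $P_i^j(K)$ denote the probability of absorption in state i without return
to j given that hypothesis K is true and $P_i^j(J)$ denote the corresponding
probability given J, where $i = \pm(n+1)$. Then since return to 0 is a
recurrent event,

$$P_e = \tfrac{1}{2} \frac{P^0_{-(n+1)}(K)}{P^0_{-(n+1)}(K) + P^0_{n+1}(K)} + \tfrac{1}{2} \frac{P^0_{n+1}(J)}{P^0_{n+1}(J) + P^0_{-(n+1)}(J)}$$

which by symmetry

$$= \frac{P^0_{-(n+1)}(K)}{P^0_{-(n+1)}(K) + P^0_{n+1}(K)}.$$

Given K, we know that absorption at n+1 or at -(n+1) occurs with total
probability 1, so

$$P_e = \left[ 1 + \frac{P^0_{n+1}(K)}{P^0_{-(n+1)}(K)} \right]^{-1}. \qquad (1)$$
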
Let f = (1,2,3,...,n) and $1 > \delta > 0$ and consider the absorption at n+1.
If a Head is observed, move to the next higher state with probability $\delta$,
otherwise remain in that state. If a Tail is observed, return to 0 since
p(s) = s for all s. Since return to 0 is a recurrent event,

$$P^0_{n+1}(K) = p^{n+1}\delta^n + \binom{n}{1} p^{n+2} \delta^n (1-\delta) + \binom{n+1}{2} p^{n+3} \delta^n (1-\delta)^2 + \cdots$$

$$= p^{n+1} \delta^n \sum_{k=0}^{\infty} \binom{n+k-1}{k} p^k (1-\delta)^k = \frac{p^{n+1}\delta^n}{[1 - p(1-\delta)]^n} = \frac{p^{n+1}\delta^n}{(q + \delta p)^n}.$$

Similarly,

$$P^0_{-(n+1)}(K) = \frac{q^{n+1}\delta^n}{(p + \delta q)^n}.$$

Thus

$$P_e = \left[ 1 + \left(\frac{p}{q}\right)^{n+1} \left( \frac{p + \delta q}{q + \delta p} \right)^{n} \right]^{-1}.$$

Taking the limit as $\delta \to 0$, we have

$$P_e \to \left[ 1 + \left(\frac{p}{q}\right)^{2n+1} \right]^{-1}$$

which is the Hellman-Cover lower bound. Unfortunately this $\epsilon$-optimal
automaton is of little practical use since the expected time to absorption
becomes infinite.
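The approach of the randomized automaton to the lower bound can be tabulated directly from the closed form above. This is a minimal sketch that evaluates only those two expressions; the names `randomized_error` and `lower_bound` are illustrative.

```python
def randomized_error(p: float, n: int, delta: float) -> float:
    """P_e of the randomized automaton with f = (1,...,n) and step probability delta."""
    q = 1.0 - p
    ratio = (p / q) ** (n + 1) * ((p + delta * q) / (q + delta * p)) ** n
    return 1.0 / (1.0 + ratio)


def lower_bound(p: float, n: int) -> float:
    """Hellman-Cover lower bound [1 + (p/q)^(2n+1)]^(-1) for 2n+3 states."""
    q = 1.0 - p
    return 1.0 / (1.0 + (p / q) ** (2 * n + 1))


if __name__ == "__main__":
    # Smaller delta moves P_e toward the bound, at the cost of slower absorption.
    for delta in (1.0, 0.1, 0.01, 0.001):
        print(delta, randomized_error(0.7, 3, delta), lower_bound(0.7, 3))
```

Setting delta = 1 recovers the deterministic algorithm f = (1,...,n), whose error probability is [1 + (p/q)^(n+1)]^(-1).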







IV. DETERMINATION OF ERROR PROBABILITY 


To obtain an explicit expression for the probability of error in 


terms of the algorithm, we will first prove the following proposition. 


' 


Proposition 1: Let $f = (p(1),\ldots,p(n)) \in \mathcal{A}_n$. Then

$$P_e = \left[ 1 + \left(\frac{p}{q}\right)^{n+1} R_n(f) \right]^{-1}, \qquad R_n(f) = \frac{F_n(q,f)}{F_n(p,f)} \qquad (2)$$

where $F_n(x,f)$ is a polynomial in x of degree less than or equal to n and
with integral coefficients. With initial conditions $F_{-1} = 0$ and $F_0 = 1$,
these polynomials satisfy the recurrence relationship

$$F_n(x,f) = F_{n-1}(x,f) - (1-x)\, x^{p(n)}\, F_{n-p(n)-1}(x,f) \qquad (3)$$

where n = 1,2,....
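Recurrence (3) and formula (2) translate directly into a short routine. The sketch below uses exact rational arithmetic; the function names `F`, `R` and `error_probability` are illustrative, and the algorithm is passed as a tuple (p(1),...,p(n)).

```python
from fractions import Fraction
from typing import Sequence


def F(x: Fraction, f: Sequence[int]) -> Fraction:
    """F_n(x, f) from recurrence (3), with F_{-1} = 0 and F_0 = 1."""
    vals = [Fraction(0), Fraction(1)]          # vals[k + 1] holds F_k
    for k, back in enumerate(f, start=1):
        # F_k = F_{k-1} - (1 - x) x^{p(k)} F_{k-p(k)-1}
        vals.append(vals[k] - (1 - x) * x**back * vals[k - back])
    return vals[-1]


def R(p: Fraction, f: Sequence[int]) -> Fraction:
    """R_n(f) = F_n(q, f) / F_n(p, f) with q = 1 - p."""
    return F(1 - p, f) / F(p, f)


def error_probability(p: Fraction, f: Sequence[int]) -> Fraction:
    """P_e from equation (2)."""
    q = 1 - p
    return 1 / (1 + (p / q) ** (len(f) + 1) * R(p, f))


if __name__ == "__main__":
    p = Fraction(3, 4)
    for f in [(1,), (1, 2, 3), (1, 1, 2), (1, 1, 2, 2)]:
        print(f, R(p, f), float(error_probability(p, f)))
```

Exact fractions are used so that small differences between candidate algorithms are not hidden by rounding.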
Proof:
From (1) we know that

$$P_e = \left[ 1 + \frac{P^0_{n+1}(K)}{P^0_{-(n+1)}(K)} \right]^{-1}.$$

Next note that $P^0_{n+1}(K) = p\,P'_{n+1}(K)$ and that $P'_{n+1}(K) = p\,v_{1n}$,
where $P'_{n+1}(K)$ is the probability of absorption at n+1 before a visit
to 0 given that the chain was started in state 1, and $v_{1n}$ is the expected
number of visits to n before a visit to n+1 or 0 given that the chain was
started in state 1.
From basic properties of Markov chains we know that $v_{1n}$ is the (1,n)
entry of the fundamental matrix $M = (I-Q)^{-1}$, where $Q = [p_{ij}]$ is
an n x n matrix with entries $p_{ij} = p$ if j = i+1, $p_{ij} = q$ if j = i-p(i),
and $p_{ij} = 0$ otherwise. (See Figure 5).


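$$Q = \begin{bmatrix} 0 & p & 0 & 0 & 0 & 0 \\ q & 0 & p & 0 & 0 & 0 \\ 0 & q & 0 & p & 0 & 0 \\ 0 & q & 0 & 0 & p & 0 \\ 0 & q & 0 & 0 & 0 & p \\ 0 & 0 & q & 0 & 0 & 0 \end{bmatrix}$$

Figure 5: The matrix Q for the case when n = 6 and f = (1,1,1,2,3,3)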


Applying the formula for the inverse of a matrix, we have

$$v_{1n} = (-1)^{n+1}\, |(I-Q)_{(n,1)}|\, |I-Q|^{-1}$$

where $|\cdot|$ is the determinant operation and $(I-Q)_{(n,1)}$ is the submatrix
obtained by deleting the nth row and the first column of the matrix (I-Q).
This submatrix is lower triangular with entries -p on the diagonal. Hence
its determinant is equal to $(-p)^{n-1}$.
Substituting,

$$v_{1n} = (-1)^{n+1} (-p)^{n-1} |I-Q|^{-1} = p^{n-1} |I-Q|^{-1}.$$

If we denote $|I-Q| = F_n(p,f)$ and repeat the entire argument with p and q
interchanged, then

$$v_{1n} = p^{n-1} [F_n(p,f)]^{-1}$$

$$v_{-1,-n} = q^{n-1} [F_n(q,f)]^{-1}.$$

Multiplying and substituting into (1), we have that

$$P_e = \left[ 1 + \left(\frac{p}{q}\right)^{n+1} \frac{F_n(q,f)}{F_n(p,f)} \right]^{-1}$$

which was to be proved.







To establish the recurrence formula for $F_n$ (see Figure 6), expand the
determinant along the nth row. If p(n) = n, then the nth row has all zeroes
except a 1 on the diagonal; $F_n = F_{n-1}$ and (3) holds. If p(n) < n,
expansion along the nth column gives

$$F_n(p,f) = F_{n-1}(p,f) + p\, |I-Q|_{(n-1,n)}.$$

Expanding $|I-Q|_{(n-1,n)}$ along the last column and repeating this p(n)-1 times
yields

$$F_n(p,f) = F_{n-1}(p,f) + p^{p(n)} D.$$

The determinant D has all zeroes in the last row except the diagonal entry
which is -q. Therefore,

$$D = -q\, F_{n-p(n)-1}(p,f)$$

and

$$F_n(p,f) = F_{n-1}(p,f) - q\, p^{p(n)}\, F_{n-p(n)-1}(p,f).$$

Checking the initial conditions completes the proof.
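The identity $|I-Q| = F_n(p,f)$ can also be checked mechanically for small n. The following is a sketch under the definitions above: it builds I-Q for every admissible f, evaluates the determinant by exact Gaussian elimination, and compares it with the recurrence. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def F_rec(x, f):
    """F_n(x, f) from recurrence (3); vals[k + 1] holds F_k."""
    vals = [Fraction(0), Fraction(1)]
    for k, back in enumerate(f, start=1):
        vals.append(vals[k] - (1 - x) * x**back * vals[k - back])
    return vals[-1]


def i_minus_q(p, f):
    """The matrix I - Q for transient states 1..n (entries p at j=i+1, q at j=i-p(i))."""
    n, q = len(f), 1 - p
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(1, n + 1):
        if i < n:
            m[i - 1][i] -= p
        j = i - f[i - 1]
        if j >= 1:                      # j = 0 is the start state, outside Q
            m[i - 1][j - 1] -= q
    return m


def det(mat):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [row[:] for row in mat]
    n, d = len(a), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def all_algorithms(n):
    """Every f = (p(1),...,p(n)) with 1 <= p(s) <= s."""
    return product(*(range(1, s + 1) for s in range(1, n + 1)))


if __name__ == "__main__":
    p = Fraction(2, 3)
    print(all(det(i_minus_q(p, f)) == F_rec(p, f) for f in all_algorithms(4)))
```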


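$$F_4(p,f) = \begin{vmatrix} 1 & -p & 0 & 0 \\ -q & 1 & -p & 0 \\ -q & 0 & 1 & -p \\ 0 & -q & 0 & 1 \end{vmatrix}$$

$$F_5(p,f) = F_4(p,f) + p \begin{vmatrix} 1 & -p & 0 & 0 \\ -q & 1 & -p & 0 \\ -q & 0 & 1 & -p \\ 0 & -q & 0 & 0 \end{vmatrix}$$

$$= F_4(p,f) + p^2 \begin{vmatrix} 1 & -p & 0 \\ -q & 1 & -p \\ 0 & -q & 0 \end{vmatrix}$$

$$= F_4(p,f) + p^3 \begin{vmatrix} 1 & -p \\ 0 & -q \end{vmatrix}$$

$$= F_4(p,f) - p^3 q\, F_1(p,f)$$

Figure 6: The recurrence relationship, f = (1,1,2,2,3)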







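Next, call

$$R_n^* = \max_{f \in \mathcal{A}_n} R_n(f) = R_n(f^*).$$

Then from (2)

$$P_e^* = \min P_e = \left[ 1 + \left(\frac{p}{q}\right)^{n+1} R_n^* \right]^{-1}.$$

If we take f = (1,2,3,...,n), then $R_n(f) = 1$ so that $R_n^* \ge 1$.
The Hellman-Cover lower bound states that

$$\left[ 1 + \left(\frac{p}{q}\right)^{2n+1} \right]^{-1} \le \left[ 1 + \left(\frac{p}{q}\right)^{n+1} R_n^* \right]^{-1}$$

which implies that $\left(\frac{p}{q}\right)^{n+1} R_n^* \le \left(\frac{p}{q}\right)^{2n+1}$ and $R_n^* \le \left(\frac{p}{q}\right)^{n}$.

Thus $1 \le R_n^* \le \left(\frac{p}{q}\right)^{n}$.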
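For small n the maximization over the class can be carried out by exhaustive search, since there are only n! admissible algorithms. The sketch below does this with exact arithmetic and checks the two-sided bound just derived; the names `best_algorithm` and `all_algorithms` are illustrative, and the search is only practical for small n.

```python
from fractions import Fraction
from itertools import product


def F(x, f):
    """F_n(x, f) from recurrence (3); vals[k + 1] holds F_k."""
    vals = [Fraction(0), Fraction(1)]
    for k, back in enumerate(f, start=1):
        vals.append(vals[k] - (1 - x) * x**back * vals[k - back])
    return vals[-1]


def R(p, f):
    """R_n(f) = F_n(q, f) / F_n(p, f)."""
    return F(1 - p, f) / F(p, f)


def all_algorithms(n):
    """Every f = (p(1),...,p(n)) with 1 <= p(s) <= s (n! candidates)."""
    return product(*(range(1, s + 1) for s in range(1, n + 1)))


def best_algorithm(n, p):
    """Return (f*, R_n*) by brute force over the whole class."""
    return max(((f, R(p, f)) for f in all_algorithms(n)), key=lambda fr: fr[1])


if __name__ == "__main__":
    p = Fraction(3, 4)
    for n in range(1, 7):
        f_star, r_star = best_algorithm(n, p)
        print(n, f_star, float(r_star), float((p / (1 - p)) ** n))
```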




V. REFINING THE BOUNDS FOR $R_n^*$


In the preceding paragraph, it was shown that $1 \le R_n^* \le (p/q)^n$. It
would now be useful to see if $R_n^*$ exists in a form which can be expressed
as a limit for large n. To do so, proposition 2 is proved.


Proposition 2:
Let $f_1 = (p_1(1),\ldots,p_1(n_1)) \in \mathcal{A}_{n_1}$ and
$f_2 = (p_2(1),\ldots,p_2(n_2)) \in \mathcal{A}_{n_2}$ be such that $p_2(s) < s$ for $s > 1$.
Then $R^*_{n_1+n_2} \ge R_{n_1}(f_1)\, R_{n_2}(f_2)$.

Proof:
Let $f = (p(1),\ldots,p(n_1+n_2)) \in \mathcal{A}_{n_1+n_2}$ with

$$p(k) = \begin{cases} p_1(k) & \text{if } k = 1,\ldots,n_1 \\ n_1 + 1 & \text{if } k = n_1 + 1 \\ p_2(k - n_1) & \text{if } k = n_1 + 2, \ldots, n_1 + n_2. \end{cases}$$

First prove by induction on $n_2$ that

$$F_{n_1+n_2}(x,f) = F_{n_1}(x,f_1)\, F_{n_2}(x,f_2). \qquad (4)$$

If $n_2 = 1$, then $F_{n_2}(x,f_2) = 1$. But since $p(n_1+1) = n_1+1$, the recurrence
gives $F_{n_1+1}(x,f) = F_{n_1}(x,f_1)$, so we know that (4) holds.

Assume (4) is true for all $n_2 \le n$. Then if $n_2 = n+1$, from the recurrence
relationship we know that

$$F_{n_1+n_2}(x,f) = F_{n_1+n_2-1}(x,f) - x^{p_2(n_2)} (1-x)\, F_{n_1+n_2-p_2(n_2)-1}(x,f).$$

By the induction hypothesis,

$$F_{n_1+n_2-1}(x,f) = F_{n_1}(x,f_1)\, F_{n_2-1}(x,f_2)$$

and

$$F_{n_1+n_2-p_2(n_2)-1}(x,f) = F_{n_1}(x,f_1)\, F_{n_2-p_2(n_2)-1}(x,f_2).$$

Since $p_2(n_2) < n_2$, then

$$F_{n_1+n_2}(x,f) = F_{n_1}(x,f_1) \left[ F_{n_2-1}(x,f_2) - x^{p_2(n_2)} (1-x)\, F_{n_2-p_2(n_2)-1}(x,f_2) \right]$$

$$= F_{n_1}(x,f_1)\, F_{n_2}(x,f_2)$$

which completes the proof of (4).
Equation (4) has been proved true which implies that

$$R^*_{n_1+n_2} \ge R_{n_1+n_2}(f) = R_{n_1}(f_1)\, R_{n_2}(f_2)$$

which completes the proof.
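The factorization (4) can be spot-checked numerically. The sketch below builds the concatenated algorithm as in the proof, with the entry at position n_1+1 set to n_1+1 and the remaining entries of f_2 appended, and compares both sides exactly. The helper name `concatenate` is an illustrative assumption, and the check presumes p_2(s) < s for s > 1.

```python
from fractions import Fraction


def F(x, f):
    """F_n(x, f) from recurrence (3); vals[k + 1] holds F_k."""
    vals = [Fraction(0), Fraction(1)]
    for k, back in enumerate(f, start=1):
        vals.append(vals[k] - (1 - x) * x**back * vals[k - back])
    return vals[-1]


def concatenate(f1, f2):
    """Combined algorithm of Proposition 2: f1, then a full reset, then f2[1:]."""
    if any(b >= s for s, b in enumerate(f2, start=1) if s > 1):
        raise ValueError("f2 must satisfy p2(s) < s for s > 1")
    return tuple(f1) + (len(f1) + 1,) + tuple(f2[1:])


if __name__ == "__main__":
    x = Fraction(2, 3)
    f1, f2 = (1, 1, 2), (1, 1, 2, 2)
    f = concatenate(f1, f2)
    print(f, F(x, f) == F(x, f1) * F(x, f2))
```

Because (4) holds for both x = p and x = q, the ratio R of the combined algorithm is the product of the two individual ratios.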


Next note that once an algorithm has been found such that R(f) > 1 and
p(s) < s for s > 1, that $\ln R^*_{n_1+n_2} \ge \ln R^*_{n_1} + \ln R^*_{n_2}$. If it is
assumed that all optimal algorithms have this property from some n on, then
$\ln R_n^*$ will be a monotonically increasing convex function. From the
Hellman-Cover bound,

$$\frac{\ln R_n^*}{n} \le \ln \frac{p}{q}.$$

If $\ln R_n^*$ is monotonically increasing, then the Limit $\frac{\ln R_n^*}{n}$ exists, is
positive, and is bounded above by $\ln \frac{p}{q}$.


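To improve the lower asymptotic bound for $R_n^*$, consider the algorithm
$f = (p(1),\ldots,p(n)) \in \mathcal{A}_n$, with p(s) = 1 for s = 1,...,k and p(s) = s-k for
s = k+1,...,n; where $1 \le k \le n$. From the recurrence relationship for F,
we have

$$F_{k+1}(x) = F_k(x) - (1-x)\, x\, F_{k-1}(x)$$
$$F_{k+2}(x) = F_{k+1}(x) - (1-x)\, x^2\, F_{k-1}(x)$$
$$\vdots$$
$$F_n(x) = F_{n-1}(x) - (1-x)\, x^{n-k}\, F_{k-1}(x) \qquad (5)$$

To establish an expression for $F_k(x)$, we will prove by induction that

$$F_j(x) = \left[ x^{j+1} - (1-x)^{j+1} \right] (2x-1)^{-1} \quad \text{for } j \le k. \qquad (6)$$

Proof:
Note that

$$F_1(x) = 1 = \frac{x^2 - (1-x)^2}{2x-1}, \qquad F_2(x) = 1 - x(1-x) = \frac{2x^3 - 3x^2 + 3x - 1}{2x-1} = \frac{x^3 - (1-x)^3}{2x-1}.$$

By the induction hypothesis,

$$F_{j-1}(x) = \left[ x^{j} - (1-x)^{j} \right] (2x-1)^{-1}$$
$$F_{j}(x) = \left[ x^{j+1} - (1-x)^{j+1} \right] (2x-1)^{-1}.$$

From the recurrence relationship,

$$F_{j+1}(x) = F_j(x) - x(1-x)\, F_{j-1}(x)$$

$$= \frac{x^{j+1} - (1-x)^{j+1} - x(1-x)\left[ x^{j} - (1-x)^{j} \right]}{2x-1}$$

$$= \frac{x^{j+1}\left[ 1 - (1-x) \right] - (1-x)^{j+1} (1 - x)}{2x-1}$$

$$= \frac{x^{j+2} - (1-x)^{j+2}}{2x-1}$$

which completes the proof of (6).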
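Closed form (6) can be confirmed against the recurrence for the all-ones algorithm. This is a small sketch with exact fractions; `closed_form` is an illustrative name and the check requires x different from 1/2.

```python
from fractions import Fraction


def F(x, f):
    """F_n(x, f) from recurrence (3); vals[k + 1] holds F_k."""
    vals = [Fraction(0), Fraction(1)]
    for k, back in enumerate(f, start=1):
        vals.append(vals[k] - (1 - x) * x**back * vals[k - back])
    return vals[-1]


def closed_form(x, j):
    """Right-hand side of (6): [x^(j+1) - (1-x)^(j+1)] / (2x - 1)."""
    return (x ** (j + 1) - (1 - x) ** (j + 1)) / (2 * x - 1)


if __name__ == "__main__":
    x = Fraction(2, 3)
    for j in range(0, 8):
        print(j, F(x, (1,) * j), closed_form(x, j))
```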


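From (5) and (6), summing the geometric series with x = q,

$$F_n(q,f) = F_k(q) - p\, F_{k-1}(q) \sum_{j=1}^{n-k} q^j$$

$$= \frac{q^{k+1} - p^{k+1}}{q-p} - \frac{q \left( 1 - q^{n-k} \right) \left( q^k - p^k \right)}{q-p}$$

$$= \frac{p^k (q - p) + q^{n-k+1} \left( q^k - p^k \right)}{q-p}$$

$$= p^k + q^{n-k+1} (p^k - q^k)(p - q)^{-1}.$$

Similarly,

$$F_n(p,f) = q^k + p^{n-k+1} (p^k - q^k)(p - q)^{-1}$$

and

$$R_n(f) = \frac{p^k + q^{n-k+1} (p^k - q^k)(p - q)^{-1}}{q^k + p^{n-k+1} (p^k - q^k)(p - q)^{-1}}.$$

If $\alpha = k/n$ remains constant, then

$$R_n(f) = \frac{p^{\alpha n} + q^{n-\alpha n+1} (p^{\alpha n} - q^{\alpha n})(p - q)^{-1}}{q^{\alpha n} + p^{n-\alpha n+1} (p^{\alpha n} - q^{\alpha n})(p - q)^{-1}} = \left(\frac{p}{q}\right)^{\alpha n} \frac{1 + u_n \dfrac{q}{p-q}}{1 + v_n \dfrac{p}{p-q}}$$

where

$$u_n = \left( \frac{q^{1-\alpha}}{p^{\alpha}} \right)^{n} (p^{\alpha n} - q^{\alpha n}) = q^{(1-\alpha) n} \left[ 1 - \left(\frac{q}{p}\right)^{\alpha n} \right].$$

But $q^{1-\alpha} < 1$ for all $\alpha < 1$, so $\lim_{n\to\infty} u_n = 0$.

Let

$$v_n = \left( \frac{p^{1-\alpha}}{q^{\alpha}} \right)^{n} (p^{\alpha n} - q^{\alpha n}) = \left( \frac{p}{q^{\alpha}} \right)^{n} \left[ 1 - \left(\frac{q}{p}\right)^{\alpha n} \right].$$

Here $(q/p)^{\alpha n} \to 0$ for all $\alpha$, but $p/q^{\alpha}$ may be greater than or
equal to 1 depending on the choice of $\alpha$.

If $p/q^{\alpha} = 1$, then $\ln p - \alpha \ln q = 0$ and $\alpha = \dfrac{\ln p}{\ln q}$.

If $p/q^{\alpha} > 1$, then $\ln p - \alpha \ln q > 0$ and, since $\ln q < 0$, $\alpha > \dfrac{\ln p}{\ln q}$.

If $p/q^{\alpha} < 1$, then $\ln p - \alpha \ln q < 0$ and $\alpha < \dfrac{\ln p}{\ln q}$.

Thus

$$\lim_{n\to\infty} v_n = \begin{cases} 0 & \text{if } \alpha < \dfrac{\ln p}{\ln q} \\ 1 & \text{if } \alpha = \dfrac{\ln p}{\ln q} \\ \infty & \text{if } \alpha > \dfrac{\ln p}{\ln q}. \end{cases}$$

Substituting,

$$\lim_{n\to\infty} \frac{R_n(f)}{(p/q)^{\alpha n}} = \lim_{n\to\infty} \frac{1 + u_n \dfrac{q}{p-q}}{1 + v_n \dfrac{p}{p-q}} = \begin{cases} 1 & \text{if } \alpha < \dfrac{\ln p}{\ln q} \\[1.5ex] \left[ 1 + \dfrac{p}{p-q} \right]^{-1} & \text{if } \alpha = \dfrac{\ln p}{\ln q} \\[1.5ex] 0 & \text{if } \alpha > \dfrac{\ln p}{\ln q}. \end{cases}$$

Then since $R_n^* \ge R_n(f)$, $R_n^*$ increases asymptotically at least as fast as
$(p/q)^{\alpha n}$ with $\alpha = \ln p / \ln q$.
With the previous assumptions regarding the optimal form of the algorithm,
we can now state that

$$\frac{\ln p}{\ln q} \ln \frac{p}{q} \le \lim_{n\to\infty} \frac{\ln R_n^*}{n} \le \ln \frac{p}{q}.$$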

VI. DETERMINATION OF $R_n^*$ AND $f^*$ FOR SMALL n


To establish asymptotic bounds on $R_n^*$, it was necessary to assume that
from some n on, optimal algorithms were such that $R_n^* > 1$ and p(s) < s for
s > 1.

To test this assumption, $R_n(f)$ was calculated for n = 1, 2, 3, and 4.
For each case $R_n^*$ was determined algebraically. For the values of
$p \in (\tfrac{1}{2}, 1)$ it was found that

$$R_1^* = R_2^* = 1 \quad \text{for any } f$$

$$R_3^* = \frac{p + q^3}{q + p^3} \quad \text{with } f^* = (1,1,2)$$

$$R_4^* = \frac{1 - pq - 2pq^2}{1 - pq - 2p^2 q} \quad \text{with } f^* = (1,1,2,2)$$

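The calculations used to determine these values of R* are at Appendix A.
For values of n from 5 to 9, several values of $p > \tfrac{1}{2}$ were chosen and
the search was performed using an IBM 360/67 in double precision. The
program is at Appendix B. For the vicinity of 1, values of p of .99 and
.999 were used. In the vicinity of $\tfrac{1}{2}$, the optimal algorithm was
determined by a Taylor series expansion around $\tfrac{1}{2} + \epsilon$, and neglecting terms of
order $\epsilon^2$. By neglecting terms of order $\epsilon^2$,

$$R_n(f) = \frac{F(\tfrac{1}{2} - \epsilon)}{F(\tfrac{1}{2} + \epsilon)} \approx \frac{F(\tfrac{1}{2}) - \epsilon F'(\tfrac{1}{2})}{F(\tfrac{1}{2}) + \epsilon F'(\tfrac{1}{2})} = \frac{1 - \epsilon \dfrac{F'(\frac{1}{2})}{F(\frac{1}{2})}}{1 + \epsilon \dfrac{F'(\frac{1}{2})}{F(\frac{1}{2})}}.$$
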
To maximize $R_n(f)$, minimize $\dfrac{F'(\frac{1}{2})}{F(\frac{1}{2})}$.

Since F is a polynomial in x of degree less than or equal to n,

$$F_n(x) = \sum_{j=0}^{n} a(n,j)\, x^j,$$

where $a(\cdot,\cdot)$ denotes the coefficients of the polynomial. Differentiating,

$$F_n'(x) = \sum_{j=1}^{n} j\, a(n,j)\, x^{j-1},$$

forming the ratio yields,

$$\frac{F'(\frac{1}{2})}{F(\frac{1}{2})} = \frac{\sum_{j=1}^{n} j\, a(n,j) \left(\frac{1}{2}\right)^{j-1}}{\sum_{j=0}^{n} a(n,j) \left(\frac{1}{2}\right)^{j}} = \frac{\sum_{j=1}^{n} j\, a(n,j)\, 2^{n-j+1}}{\sum_{j=0}^{n} a(n,j)\, 2^{n-j}}.$$



The program for the search of the ratio of these polynomials is at 
Appendix C. 
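The search criterion near p = 1/2 can be illustrated in a few lines. The following is a sketch of the idea only and is not the program of Appendix C: it generates the integer coefficients a(n,j) from recurrence (3), evaluates F'(1/2)/F(1/2) exactly, and minimizes over all admissible f for small n. All names are illustrative.

```python
from fractions import Fraction
from itertools import product


def poly_F(f):
    """Integer coefficients [a(n,0), ..., a(n,deg)] of F_n(x, f), lowest degree first."""
    polys = [[0], [1]]                         # polys[k + 1] holds F_k
    for k, back in enumerate(f, start=1):
        shifted = [0] * back + polys[k - back] + [0]      # x^p(k) * F_{k-p(k)-1}
        term = [shifted[i] - (shifted[i - 1] if i else 0)  # multiply by (1 - x)
                for i in range(len(shifted))]
        prev = polys[k]
        size = max(len(prev), len(term))
        new = [(prev[i] if i < len(prev) else 0) - (term[i] if i < len(term) else 0)
               for i in range(size)]
        while len(new) > 1 and new[-1] == 0:   # drop trailing zero coefficients
            new.pop()
        polys.append(new)
    return polys[-1]


def derivative_ratio(f):
    """F'(1/2) / F(1/2), computed exactly from the coefficients."""
    a = poly_F(f)
    half = Fraction(1, 2)
    value = sum(c * half**j for j, c in enumerate(a))
    deriv = sum(j * c * half ** (j - 1) for j, c in enumerate(a) if j >= 1)
    return deriv / value


def best_near_half(n):
    """Algorithm minimizing F'(1/2)/F(1/2) over all f with 1 <= p(s) <= s."""
    candidates = product(*(range(1, s + 1) for s in range(1, n + 1)))
    return min(candidates, key=derivative_ratio)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, best_near_half(n))
```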

The results are summarized in Table I and figures 7, 8, and 9. They
seem to confirm the assumptions made in proposition 2 and further indicate
that the optimal form of the algorithm is probably of the form

f* = (1,1,...,1,2,2,...,2,3,3,...,3,4,...)

with the lengths of the blocks of constant p(s) depending on n and p.

Computer run time precludes a complete search of algorithms for n
much larger than 9; however, it may be useful to assume that f* is of the
form (1,...,1,2,...,2,3,...) and to maximize $R_n$ by manipulating the lengths
of blocks of the p's. Such an algorithm is also intuitively appealing.
Near the initial state the information content of an event is low as 


characterized by p(s) = 1. As the automaton approaches the decision
point (absorption), p(s) increases as if the information content of a
negative event had increased. In some respects this seems to be similar 


to the human decision process.

Figure 9. Natural logarithm of R* versus p.


APPENDIX A


Calculation of $R_n^*$ for n = 3 and n = 4.

For the case when n = 3, $f_3^* = (1,1,2)$ and

$$R_3^* = \frac{1 - 2p + 3p^2 - p^3}{1 - p + p^3}.$$

Proof:
If $f^{(1)} = (1,1,1)$, then $R_3^{(1)} = 1$.

If $f^{(2)} = (1,1,2)$, then $R_3^{(2)} = \dfrac{1 - 2p + 3p^2 - p^3}{1 - p + p^3}$.

$$3p > 1 + 2p^2 \quad \text{for } \tfrac{1}{2} < p < 1$$
$$3p^2 > p + 2p^3; \quad \text{adding } 1 - 2p - p^3,$$
$$1 - 2p + 3p^2 - p^3 > 1 - p + p^3$$
$$\frac{1 - 2p + 3p^2 - p^3}{1 - p + p^3} > 1.$$

Therefore $R_3^{(2)} > R_3^{(1)}$, and $R_3^* = R_3^{(2)}$.
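For the case of n = 4, $f_4^* = (1,1,2,2)$, and

$$R_4^* = \frac{1 - pq - 2pq^2}{1 - pq - 2p^2 q} = \frac{1 - 3p + 5p^2 - 2p^3}{1 - p - p^2 + 2p^3}.$$

Proof:
If $f^{(1)} = (1,1,1,1)$, then $R_4^{(1)} = 1$.

$$3p > 1 + 2p^2 \quad \text{for } \tfrac{1}{2} < p < 1$$
$$6p^2 - 4p^3 > 2p; \quad \text{adding } 1 - 3p - p^2 + 2p^3,$$
$$1 - 3p + 5p^2 - 2p^3 > 1 - p - p^2 + 2p^3$$
$$\frac{1 - 3p + 5p^2 - 2p^3}{1 - p - p^2 + 2p^3} > 1.$$

Thus $R_4^* > R_4^{(1)}$.

If $f^{(2)} = (1,1,1,2)$, $R_4^{(2)} = \dfrac{1 - 3p + 4p^2 - p^3}{1 - 2p + p^2 + p^3}$.

$$6p + 15p^3 > 1 + 14p^2 + 6p^4 \quad \text{for } \tfrac{1}{2} < p < 1$$
$$6p^2 + 15p^4 > p + 14p^3 + 6p^5;$$
$$\text{adding } 1 - 5p + 6p^2 - 14p^3 - 9p^4 + 3p^5 - 2p^6,$$
$$1 - 5p + 12p^2 - 14p^3 + 6p^4 + 3p^5 - 2p^6 > 1 - 4p + 6p^2 - 9p^4 + 9p^5 - 2p^6$$
$$(1 - 3p + 5p^2 - 2p^3)(1 - 2p + p^2 + p^3) > (1 - 3p + 4p^2 - p^3)(1 - p - p^2 + 2p^3)$$
$$\frac{1 - 3p + 5p^2 - 2p^3}{1 - p - p^2 + 2p^3} > \frac{1 - 3p + 4p^2 - p^3}{1 - 2p + p^2 + p^3}.$$

Thus $R_4^* > R_4^{(2)}$.

If $f^{(3)} = (1,1,1,3)$, $R_4^{(3)} = \dfrac{1 - 3p + 5p^2 - 3p^3 + p^4}{1 - 2p + 2p^2 - p^3 + p^4}$.

$$6p + 25p^3 + 14p^5 > 1 + 16p^2 + 24p^4 + 4p^6 \quad \text{for } \tfrac{1}{2} < p < 1$$
$$6p^2 + 25p^4 + 14p^6 > p + 16p^3 + 24p^5 + 4p^7.$$
$$\text{Adding } 1 - 5p + 7p^2 - 19p^3 - 7p^4 - 12p^5 - 7p^6 - 2p^7,$$
$$1 - 5p + 13p^2 - 19p^3 + 18p^4 - 12p^5 + 7p^6 - 2p^7 > 1 - 4p + 7p^2 - 3p^3 - 7p^4 + 12p^5 - 7p^6 + 2p^7$$
$$(1 - 3p + 5p^2 - 2p^3)(1 - 2p + 2p^2 - p^3 + p^4) > (1 - 3p + 5p^2 - 3p^3 + p^4)(1 - p - p^2 + 2p^3)$$
$$\frac{1 - 3p + 5p^2 - 2p^3}{1 - p - p^2 + 2p^3} > \frac{1 - 3p + 5p^2 - 3p^3 + p^4}{1 - 2p + 2p^2 - p^3 + p^4}.$$

Thus $R_4^* > R_4^{(3)}$.

If $f^{(4)} = (1,1,1,4)$, $R_4^{(4)} = \dfrac{1 - 2p + 2p^2}{1 - 2p + 2p^2} = 1 = R_4^{(1)}$.

Thus, $R_4^* > R_4^{(4)}$.

If $f^{(5)} = (1,1,2,3)$, $R_4^{(5)} = \dfrac{1 - 3p + 6p^2 - 4p^3 + p^4}{1 - p + p^4}$.

$$5p + 7p^3 > 1 + 9p^2 + 2p^4 \quad \text{for } \tfrac{1}{2} < p < 1$$
$$10p^4 + 14p^6 > 2p^3 + 18p^5 + 4p^7.$$
$$\text{Adding } 1 - 4p + 8p^2 - 7p^3 - 7p^4 - 3p^5 - 9p^6 - 2p^7,$$
$$1 - 4p + 8p^2 - 7p^3 + 3p^4 - 3p^5 + 5p^6 - 2p^7 > 1 - 4p + 8p^2 - 5p^3 - 7p^4 + 15p^5 - 9p^6 + 2p^7$$
$$(1 - 3p + 5p^2 - 2p^3)(1 - p + p^4) > (1 - 3p + 6p^2 - 4p^3 + p^4)(1 - p - p^2 + 2p^3)$$
$$\frac{1 - 3p + 5p^2 - 2p^3}{1 - p - p^2 + 2p^3} > \frac{1 - 3p + 6p^2 - 4p^3 + p^4}{1 - p + p^4}.$$

Thus $R_4^* > R_4^{(5)}$, which completes the proof.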